A Study of Reuse and Plagiarism in Speech and Natural Language Processing papers
Abstract
This experiment presents a simple way to compare text fragments in order to detect the (supposed) results of copy-and-paste operations between articles in the field of Natural Language Processing (NLP), including Speech Processing. The search space for the comparisons is a corpus called NLP4NLP, which covers 34 different sources and gathers a large part of the publications in the NLP field over the past 50 years. The study measures the similarity between the papers of each individual source and the complete set of papers in the corpus, according to four types of relationship (self-reuse, self-plagiarism, reuse and plagiarism) and in both directions: a source paper borrowing a fragment of text from another paper in the collection, or, in the reverse direction, fragments of text from the source paper being borrowed and inserted into another paper in the collection.
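The comparison described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the authors' method: the abstract names the four relationship types but does not define them, so the sketch assumes they are separated by whether the two papers share an author and whether the borrowing paper cites its source. The 7-word window, the `Paper` record and the function names are likewise hypothetical.

```python
from typing import NamedTuple


class Paper(NamedTuple):
    """Hypothetical record for one paper of the corpus."""
    paper_id: str
    authors: frozenset    # author names
    cited_ids: frozenset  # ids of the papers this paper cites
    text: str


def word_ngrams(text, n=7):
    """Return the set of all n-word windows in the text (n=7 is an assumed size)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_fragments(a, b, n=7):
    """Word windows that occur verbatim in both papers."""
    return word_ngrams(a.text, n) & word_ngrams(b.text, n)


def classify(borrower, source):
    """Label the relationship; the authorship/citation split is an assumption."""
    same_author = bool(borrower.authors & source.authors)
    cites_source = source.paper_id in borrower.cited_ids
    if same_author:
        return "self-reuse" if cites_source else "self-plagiarism"
    return "reuse" if cites_source else "plagiarism"
```

Under these assumptions, the two directions mentioned in the abstract correspond to calling `classify` with a given paper as `borrower` or as `source`; deciding which paper borrowed from which (for example, by publication date) is outside the sketch.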
Similar resources
A Study of Reuse and Plagiarism in LREC papers
The aim of this experiment is to present an easy way to compare fragments of texts in order to detect (supposed) results of copy & paste operations between articles in the domain of Natural Language Processing (NLP). The search space of the comparisons is a corpus labeled as NLP4NLP gathering a large part of the NLP field. The study is centered on LREC papers in both directions, first with an L...
Plagiarism checker for Persian (PCP) texts using hash-based tree representative fingerprinting
With due respect to authors' rights, plagiarism detection is one of the critical problems in the field of text mining and one that interests many researchers. The issue is considered serious in higher academic institutions. Language-independent tools exist, but they do not yield reliable results because they ignore the special features of each language. Considering the paucit...
Plagiarism Detection Algorithm Using Natural Language Processing Based on Grammar Analyzing
Plagiarism has become one of the most concerning problems, since several kinds of plagiarism are hard to detect. Extrinsic plagiarism is now handled well, but intrinsic plagiarism is not. Intrinsic plagiarism detection is hampered by rearranged sentence structure and the use of other words with the same meaning. Several methods have been researched to handle this prob...
An introduction to the examples of scientific plagiarism and its identification soft-wares
Background: Increasing immorality and plagiarism in the country's higher education system have become a serious crisis. Hence, the purpose of this study was to analyze examples of plagiarism and to introduce plagiarism detection software. Method: The present study is a narrative review study. Articles in Persian and Latin related to the use of scientific theft key words in databases w...
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
Publication date: 2016